ABSTRACT

Weak-lensing shear estimates show a troublesome dependence on the apparent brightness of 
the galaxies used to measure the ellipticity: In several studies, the amplitude of the inferred 
shear falls sharply with decreasing source significance. This dependence limits the overall 
ability of upcoming large weak-lensing surveys to constrain cosmological parameters. 
We seek to provide a concise overview of the impact of pixel noise on weak-lensing measurements,
covering the entire path from noisy images to shear estimates. We show that there are
at least three distinct layers where pixel noise not only obscures but biases the outcome of
the measurements: 1) the propagation of pixel noise to the non-linear observable ellipticity; 2)
the response of the shape-measurement methods to the limited amount of information extractable
from noisy images; and 3) the reaction of shear estimation statistics to the presence of noise
and outliers in the measured ellipticities.

We identify and discuss several fundamental problems and show that each of them is able
to introduce biases in the range of a few tenths of a percent to a few percent for galaxies with typical
significance levels. Furthermore, all of these biases depend not only on the brightness of
galaxies but also on their ellipticity, with more elliptical galaxies often being harder to measure
correctly. We also discuss existing possibilities to mitigate and novel ideas to avoid the
biases induced by pixel noise. We present a new shear estimator that shows a more robust performance
for noisy ellipticity samples. Finally, we release the open-source PYTHON code to
predict and efficiently sample from the noisy ellipticity distribution and the shear estimators
used in this work at this URL.
1 INTRODUCTION

Current and in particular upcoming wide-field imaging surveys
such as the Dark Energy Survey^, the Kilo Degree Survey^, the
Hyper Suprime Camera Survey, Euclid^ and the Wide-Field Infrared
Survey Telescope^ require highly accurate shape measurement
methods to reach the forecasted accuracy of cosmological parameter
constraints, e.g. for the energy density of matter $\Omega_m$, the
normalization of the matter power spectrum $\sigma_8$, the Dark Energy
equation-of-state parameter $w$ and its variation with time. The
currently most demanding lensing method, the cosmic shear two-point
correlation function, allows for multiplicative errors (defined as the
deviation of actual shear from the measurement by a factor $m$), with

* E-mail: melchior.12@osu.edu 

^ http://www.darkenergysurvey.org/

^ http://kids.strw.leidenuniv.nl/

^ http://sci.esa.int/euclid 

^ http://wfirst.gsfc.nasa.gov/ 



$|m|$ not larger than a few per mille (e.g. Huterer et al. 2006; Amara
& Refregier 2008). But with increasing survey volumes, traditionally
less demanding techniques such as stacked cluster lensing
will also require $|m|$ of order 1% (Weinberg et al. 2012). Shear
measurement methods known so far can reach these requirements
in certain regimes, for instance for well-resolved and bright galaxies.
However, they commonly struggle with small and especially
with faint galaxies (Massey et al. 2007; Bridle et al. 2010; Kitching
et al. 2012). Many suggestions have been brought forward as to
why prominent pixel noise hampers shape measurements, but often
it was difficult to disentangle the causes from their consequences.
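The multiplicative error $m$ quoted above can be made concrete with a short numerical sketch. It assumes the common linear parameterization $g_\mathrm{meas} = (1 + m)\,g_\mathrm{true} + c$; the additive term $c$, the function name `fit_shear_bias` and all input values are illustrative assumptions and are not taken from this work.

```python
import numpy as np

def fit_shear_bias(g_true, g_meas):
    """Least-squares fit of g_meas = (1 + m) * g_true + c; returns (m, c)."""
    slope, intercept = np.polyfit(g_true, g_meas, deg=1)
    return slope - 1.0, intercept

rng = np.random.default_rng(0)
g_true = np.linspace(-0.05, 0.05, 21)      # input shears (illustrative)
m_in, c_in = -0.02, 1e-4                   # hypothetical biases, not from the paper
g_meas = (1.0 + m_in) * g_true + c_in + rng.normal(0.0, 1e-4, g_true.size)

m_fit, c_fit = fit_shear_bias(g_true, g_meas)
print(f"m = {m_fit:+.4f}, c = {c_fit:+.2e}")
```

A requirement of $|m|$ at the per-mille level then translates into the fitted slope deviating from unity by no more than a few times $10^{-3}$.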

Bernstein & Jarvis (2002) computed that this so-called noise 
rectification bias generally scales inversely with the second power 
of the object's significance. Hirata et al. (2004) obtained an analytic 
description of the dependence of this bias on the sizes of galax- 
ies, but this derivation only applies to their adaptive moment-based 
measurement method. Recently, Refregier et al. (2012) showed that
biases in the maximum-likelihood estimators of model-fitting approaches
are a direct consequence of the presence of non-linear fit
parameters. They also provide an analytic expression for the bias
of the ellipticity estimate, valid for Gaussian-shaped galaxies and
point-spread functions.

We seek to generalize and extend these previous findings. In
particular, we choose an approach that is as method-independent
as possible, offering insights into the several ways in which pixel
noise obscures the estimation of weak gravitational shear.



1.1 Approach

Throughout this work, we aim to identify conceptual problems $\mathcal{P}$
for the shape estimation task, which give rise to deviations of the
measured source ellipticity $e'(\mathcal{P})$ from the true ellipticity $e$. We
assume a probabilistic approach, where we inspect the probability
distribution of $\mathcal{P}$ under noise, $p(\mathcal{P}|\nu)$, with $\nu$ denoting some suitable
characterization of the significance of the measurement, and
its consequence, the probability distribution of the measured ellipticity
caused by $\mathcal{P}$,

$$p_{\mathcal{P}}(e'|\nu) = \int \mathrm{d}\mathcal{P}\; e'(\mathcal{P})\, p(\mathcal{P}|\nu). \tag{1}$$

The ellipticity distribution is thus given by the impact $\mathcal{P}$ has on $e'$,
weighted by the probability that $\mathcal{P}$ actually occurs in a measurement
with significance $\nu$. The key assumption here is that we could
measure the function $e'(\mathcal{P})$ perfectly, i.e. without pixel noise, while
the action of the noise is entirely contained in the width and shape
of the probability distribution of $\mathcal{P}$.

In practice, both $e'(\mathcal{P})$ and $p(\mathcal{P}|\nu)$ additionally depend on the
apparent shape of the source, i.e. its intrinsic shape and the effects
of the convolution with the point-spread function (PSF). We therefore
introduce a parameterization of the apparent source morphology with
a parameter vector $\theta$.

Although the true ellipticity $e$ could be regarded as one of
these source parameters, we choose to make the dependency of $e'$
on $e$ explicit, such that our most general form of the ellipticity
distribution caused by $\mathcal{P}$ reads as

$$p_{\mathcal{P}}(e'|e,\theta,\nu) = \int \mathrm{d}\mathcal{P}\; e'(\mathcal{P}|e,\theta)\, p(\mathcal{P}|e,\theta,\nu). \tag{2}$$



We can now define when we consider an ellipticity measurement to
be unbiased by $\mathcal{P}$, namely if

$$\left\langle p_{\mathcal{P}}(e'|e,\theta,\nu) \right\rangle = e, \tag{3}$$

where the average is taken over independent noise realizations of
images of identical galaxies, parameterized by $(e, \theta)$. In section 2
and section 3 we inspect cases where, for different
reasons, biases occur for fixed $e$, whereas in section 4 we
discuss the application of statistics to samples of noisy ellipticity
estimates, whose true values are drawn from an underlying distribution
$p(e)$. We conclude in section 5.



2 NON-LINEAR ERROR PROPAGATION 

An object's ellipticity is necessarily a non-linear quantity since any 
definition needs to invoke a ratio of the two key parameters of the 
geometric ellipse, its semi-major and semi-minor axes. As with any
parameter that depends non-linearly on the data, even a symmetric
distribution of noise for each data point translates into a much
more complicated, in general asymmetric and skewed, distribution
of the ellipticity. Refregier et al. (2012) describe this generic problem
specifically for the case of model-dependent galaxy shape measurements
and work out the bias on size and ellipticity of the best-fit
model that occurs even if the functional form of the galactic shape 
is perfectly known. We extend their finding by providing a theoreti- 
cal form of the distribution of noisy ellipticity estimates that works 
for model- and moment-based approaches. The key ingredient of 
the derivation is the understanding that any attempt to measure the 
ellipticity is affected by the spurious ellipticity pattern of the noise 
realization recorded in the image, which is entirely describable by 
its moments, whereas spatial models for arbitrary noise configura- 
tions are not meaningful. 

The second-order moments of the light distribution $I(\mathbf{x})$, centered
at $\bar{\mathbf{x}}$,

$$Q_{ij} = \int \mathrm{d}^2x\; I(\mathbf{x})\,(x_i - \bar{x}_i)(x_j - \bar{x}_j), \tag{4}$$

give rise to a number of ellipticity estimators, two of which are in
widespread use,

$$\chi \equiv \frac{Q_{11} - Q_{22} + 2\mathrm{i}Q_{12}}{Q_{11} + Q_{22}} \tag{5a}$$

and

$$e \equiv \frac{Q_{11} - Q_{22} + 2\mathrm{i}Q_{12}}{Q_{11} + Q_{22} + 2\sqrt{Q_{11}Q_{22} - Q_{12}^2}}. \tag{5b}$$

We review their advantages and disadvantages in Appendix A. In
summary, $e$ is, at least theoretically, an unbiased estimator of the
shear, while $\chi$ has a much more favorable distribution under noise,
but requires higher-order corrections when used as shear estimator.
Since we seek to obtain a distribution of ellipticities under noise, we
choose $\chi$ for the derivation, but continue to use $e$ when referring to
the proper ellipticity of an object.
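As an illustration of Equations 4 and 5, the following sketch measures unweighted second-order moments on a pixel grid and forms both ellipticity estimators. The function names, the image size and the elliptical Gaussian test object are assumptions made for this example only; real data would additionally require a weight function and PSF correction.

```python
import numpy as np

def second_moments(image):
    """Unweighted moments Q11, Q22, Q12 (Equation 4) about the flux centroid."""
    y, x = np.indices(image.shape, dtype=float)
    flux = image.sum()
    xc, yc = (image * x).sum() / flux, (image * y).sum() / flux
    dx, dy = x - xc, y - yc
    q11 = (image * dx * dx).sum()
    q22 = (image * dy * dy).sum()
    q12 = (image * dx * dy).sum()
    return q11, q22, q12

def ellipticities(q11, q22, q12):
    """Complex ellipticities of Equations 5a and 5b."""
    numerator = q11 - q22 + 2j * q12
    chi = numerator / (q11 + q22)
    eps = numerator / (q11 + q22 + 2.0 * np.sqrt(q11 * q22 - q12**2))
    return chi, eps

# Noise-free elliptical Gaussian with semi-axes a = 4 and b = 2 pixels along x and y
y, x = np.indices((65, 65), dtype=float)
a, b = 4.0, 2.0
image = np.exp(-0.5 * (((x - 32.0) / a) ** 2 + ((y - 32.0) / b) ** 2))

chi, eps = ellipticities(*second_moments(image))
print(chi.real, eps.real)
```

For this test object the moments are proportional to $a^2$ and $b^2$, so one expects $\chi = (a^2 - b^2)/(a^2 + b^2) = 0.6$ and $e = (a - b)/(a + b) = 1/3$, which also satisfies $\chi = 2e/(1 + e^2)$.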

In the presence of noise, Equation 4 needs to be modified,

$$Q'_{ij} = \int \mathrm{d}^2x\; W(\mathbf{x} - \bar{\mathbf{x}})\,[I(\mathbf{x}) + n(\mathbf{x})]\,(x_i - \bar{x}_i)(x_j - \bar{x}_j), \tag{6}$$

where $W$ is a centered weight function and $n$ is the
noise term, which, in the background-dominated limit of faint objects,
can be assumed to be drawn from an uncorrelated Gaussian
distribution with variance $\sigma_n^2$,

$$n \sim \mathcal{N}(0, \sigma_n^2), \quad \langle n(\mathbf{x})\,n(\mathbf{x}') \rangle = \sigma_n^2\,\delta(\mathbf{x} - \mathbf{x}'). \tag{7}$$

Even though weight functions are only explicitly present in
moment-based approaches, model-based approaches effectively
weigh pixels according to their distance and the shape of the employed
model, so they too make use of a weight function.

Since moments are linear in the image data, they inherit the
Gaussian error distribution from $n$. By the same token, the algebraic
sums of moments defining the numerator and the denominator
in Equation 5a are Gaussian distributed. But what about their ratio?
It is a standard result in statistics that the ratio $t$ of two uncorrelated
Gaussian variates with zero mean and unit variance, $\mathcal{N}(0, 1)$, is
distributed according to the Cauchy distribution,

$$C(t) = \frac{1}{\pi(1 + t^2)}. \tag{8}$$
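The Cauchy result of Equation 8 is easy to check numerically. The sketch below draws two independent standard normal samples, forms their ratio and compares the sampled cumulative fractions with the Cauchy cumulative distribution $1/2 + \arctan(t)/\pi$. Sample size and seed are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.standard_normal((2, 200_000))
t = x / y                                   # ratio of two independent N(0, 1) variates

def cauchy_cdf(t0):
    """Cumulative distribution of the standard Cauchy density of Equation 8."""
    return 0.5 + np.arctan(t0) / np.pi

for t0 in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"t < {t0:+.0f}: sampled {np.mean(t < t0):.4f}, Cauchy {cauchy_cdf(t0):.4f}")

# Wide wings: fraction of ratios beyond |t| = 10
tail_fraction = np.mean(np.abs(t) > 10.0)
print("fraction with |t| > 10:", tail_fraction)
```

The slowly decaying tail fraction is the property that later allows noisy ellipticities to leave the physically allowed range.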



But for the ellipticity-measurement problem, the combinations of
moments defining $\chi$ neither have zero mean nor are they uncorrelated.
It is obvious that at least the denominator of $\chi$ does not
vanish for a source with non-negative brightness distribution. Also,
the origin of the correlation quickly becomes apparent when we
choose a frame such that $|\chi| = \chi_1 > 0$, which can always be realized
by a suitable rotation. We can write the mapping of $(Q_{11}, Q_{22})$
onto new variables $(w, z)$, the denominator and numerator of $\chi_1$,
as a linear operator

$$M = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{9}$$

Figure 1. Correlation coefficient $\rho_n$ between noisy moments $Q'_{11}$ and $Q'_{22}$
(black line) and between the moment combinations $z = Q'_{11} - Q'_{22}$ and
$w = Q'_{11} + Q'_{22}$ (red line) as a function of the ellipticity of the Gaussian
weighting function. Due to $\rho_n \approx 0.325$, the errors of $w$ and $z$ are different
(their ratio: blue line). Dots indicate the measurement of these quantities
in numerical tests with 10,000 noise realizations for each galaxy image.
Variation of the size and radial profile of the weight function amounted to
sub-percent changes on the quantities shown above.



The covariance matrix of $(w, z)$ is then given by

$$M\,\Sigma_{11,22}\,M^T = \begin{pmatrix} \sigma_{11}^2 + \sigma_{22}^2 & \sigma_{11}^2 - \sigma_{22}^2 \\ \sigma_{11}^2 - \sigma_{22}^2 & \sigma_{11}^2 + \sigma_{22}^2 \end{pmatrix}, \tag{10}$$

where $\Sigma_{11,22} = \mathrm{Diag}(\sigma_{11}^2, \sigma_{22}^2)$ denotes the (diagonal) covariance
matrix of the moments $Q_{ii}$. Thus, the correlation between $w$ and $z$,

$$\rho = \frac{\sigma_{11}^2 - \sigma_{22}^2}{\sigma_{11}^2 + \sigma_{22}^2}, \tag{11}$$

only vanishes if the variances of the two moments $Q_{11}$ and $Q_{22}$
are equal, i.e. for circular galaxies. However, if the weight function
$W$ is adjusted to match the apparent shape of the galaxy, then the
variances will generally be different. In our case with $\chi_1 > 0$,
the pixels along the 1-direction (the semi-major axis) have a larger
weight than those perpendicular to it. It follows that $\sigma_{11} > \sigma_{22}$.
Thus, the variances of both $w$ and $z$ are driven by $\sigma_{11}$, and they
become more and more correlated the larger $\chi_1$ gets.

But this picture is not yet entirely complete because for Equation 10
we assumed the moments $Q_{11}$ and $Q_{22}$ to be uncorrelated,
but for any given image they are determined by the same noise realization.
Imagine a perfectly noise-free image, where we add a positive
noise fluctuation in only one pixel. According to Equation 6,
both $Q'_{11}$ and $Q'_{22}$ will then be larger than their noise-free counterparts
since their distance-weighting factors $(x_i - \bar{x}_i)^2$ are positive
or zero. Effectively, the errors of these two moments are not independent.
As we show in Appendix B for the case of elliptical objects
with Gaussian radial profiles of width $s$, the covariance matrix of
$Q_{11}$ and $Q_{22}$ is given by

$$\Sigma_{11,22} \equiv \begin{pmatrix} \sigma_{11}^2 & \sigma_{12}^2 \\ \sigma_{12}^2 & \sigma_{22}^2 \end{pmatrix} \propto \sigma_n^2\, s^6 \begin{pmatrix} \dfrac{3}{(1-e)^5(1+e)} & \dfrac{1}{(1-e)^3(1+e)^3} \\ \dfrac{1}{(1-e)^3(1+e)^3} & \dfrac{3}{(1-e)(1+e)^5} \end{pmatrix}, \tag{12}$$



Figure 2. Marsaglia distribution (Equation 15) of the ellipticity $\chi$ of a
galaxy with true ellipticity $\chi = 0.88$ (equivalent to $e = 0.6$, indicated
by the vertical dotted line) as a function of image significance $\nu$. The
correlation strength $\rho = 0.993$ is characteristic of this value of $e$ (cf. Equation 11
and Figure 1). The hatched region indicates the unphysical but possible
range of outliers with $|\chi| \geq 1$.



which automatically implies that the correlation coefficient between
$Q_{11}$ and $Q_{22}$,

$$\rho_n = \frac{\sigma_{12}^2}{\sigma_{11}\sigma_{22}} = \frac{1}{3}, \tag{13}$$

is independent of size and ellipticity of the object. Numerical tests
showed that this correlation is largely unaffected even by changes
to the radial profile of the weight function (see Figure 1). The consequence
of this correlation is a modification of Equation 10,

$$M\,\Sigma_{11,22}\,M^T = \begin{pmatrix} \sigma_{11}^2 + \sigma_{22}^2 + 2\rho_n\sigma_{11}\sigma_{22} & \sigma_{11}^2 - \sigma_{22}^2 \\ \sigma_{11}^2 - \sigma_{22}^2 & \sigma_{11}^2 + \sigma_{22}^2 - 2\rho_n\sigma_{11}\sigma_{22} \end{pmatrix}, \tag{14}$$

which does not alter the covariance of $w$ and $z$ but changes their variances. The ratio
$\sigma_z/\sigma_w$ is also shown in Figure 1, together with $\rho$. It is remarkable
how strongly correlated the two moment combinations become
even at modest ellipticities. Again, this result shows only marginal
changes under variation of weighting function size or radial profile.
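The covariance structure discussed above can be checked with direct pixel sums. For uncorrelated noise (Equation 7), the covariance of two weighted moments is $\sigma_n^2 \sum W^2 (\Delta x_i \Delta x_j)(\Delta x_k \Delta x_l)$ over all pixels. The sketch below assumes an elliptical Gaussian weight whose semi-axes are $s/(1-e)$ and $s/(1+e)$; this parameterization, the grid size and the function names are assumptions of the example, not a reproduction of the numerical tests behind Figure 1.

```python
import numpy as np

M = np.array([[1.0, 1.0], [1.0, -1.0]])       # maps (Q11, Q22) onto (w, z), Equation 9

def moment_covariance(e, s=4.0, half_size=60, sigma_n=1.0):
    """Covariance of (Q'11, Q'22) for an elliptical Gaussian weight, by pixel sums."""
    a, b = s / (1.0 - e), s / (1.0 + e)       # assumed semi-axes along directions 1 and 2
    grid = np.arange(-half_size, half_size + 1, dtype=float)
    x, y = np.meshgrid(grid, grid)
    w2 = np.exp(-(x / a) ** 2 - (y / b) ** 2)  # squared weight function W^2
    var11 = sigma_n**2 * np.sum(w2 * x**4)
    var22 = sigma_n**2 * np.sum(w2 * y**4)
    cov12 = sigma_n**2 * np.sum(w2 * x**2 * y**2)
    return np.array([[var11, cov12], [cov12, var22]])

def correlation(cov):
    """Correlation coefficient of a 2x2 covariance matrix."""
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

for e in (0.0, 0.3, 0.6):
    cov_q = moment_covariance(e)
    cov_wz = M @ cov_q @ M.T                  # full form of Equation 14
    print(f"e = {e:.1f}: rho_n = {correlation(cov_q):.3f}, "
          f"rho = {correlation(cov_wz):.3f}, "
          f"sigma_z/sigma_w = {np.sqrt(cov_wz[1, 1] / cov_wz[0, 0]):.3f}")
```

Under these assumptions the sums give $\rho_n$ close to $1/3$ for every ellipticity, and for $e = 0.6$ the correlation between $w$ and $z$ comes out near the value $\rho = 0.993$ quoted in the caption of Figure 2.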

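We are now equipped with variances and correlation of $w$ and
$z$ and want to know the distribution of the ratio $t = z/w$. Marsaglia
(1965, 2006) proved that the distribution of the ratio of correlated Gaussian variates
$w \sim \mathcal{N}(\mu_w, \sigma_w)$ and $z \sim \mathcal{N}(\mu_z, \sigma_z)$ with correlation coefficient
$\rho$ is given by

$$p_M(t) = r\, f\big(r(t - s)\big), \tag{15}$$

with constants defined as

$$r = \frac{\sigma_w}{\sigma_z\sqrt{1 - \rho^2}} \quad \text{and} \quad s = \rho\,\frac{\sigma_z}{\sigma_w}. \tag{16}$$

The function $f(r)$ describes the probability distribution of
$(a + x)/(b + y)$ with $x, y \sim \mathcal{N}(0, 1)$:

$$f(r) = \frac{e^{-\frac{1}{2}(a^2 + b^2)}}{\pi(1 + r^2)} \left[ 1 + \sqrt{\frac{\pi}{2}}\; q\, e^{\frac{1}{2}q^2}\, \mathrm{Erf}\!\left(\frac{q}{\sqrt{2}}\right) \right], \tag{17}$$

where

$$q = \frac{b + ar}{\sqrt{1 + r^2}}, \quad a = \frac{\mu_z/\sigma_z - \rho\,\mu_w/\sigma_w}{\sqrt{1 - \rho^2}} \quad \text{and} \quad b = \frac{\mu_w}{\sigma_w}. \tag{18}$$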



We will refer to Equation 15 as the Marsaglia distribution.^ For the 

^ Unsurprisingly, in the case of uncorrelated variates $w, z \sim \mathcal{N}(0, 1)$, the
first term of Equation 17 recovers the Cauchy distribution of Equation 8.
Figure 3. Top: Ellipticity bias $b(\chi)$, defined as the difference between the
mean of the Marsaglia distribution of Equation 15 and the ellipticity $\chi$, as
a function of the proper ellipticity $e$ for different significance levels (solid
lines). The restriction of the integral to within $|\chi| = 1$ leads to stronger
biases (dashed lines). Bottom: The fraction of measurements with $|\chi'| > 1$.




Figure 4. Ellipticity distribution of GEMS galaxies from Haussler et al.
(2007). Galaxies in the specified magnitude range are only considered if
Galfit was able to fit a single-Sersic model. For moderate ellipticities,
the distribution is well described by the Rayleigh function (solid line), the
expected distribution for the absolute value of the two-dimensional, or complex,
ellipticity if each ellipticity component followed a common Gaussian
distribution (here: $\sigma_e = 0.33$). The measured distribution lacks highly
elliptical galaxies, either because galaxies with large ellipticity are not as
abundant in nature as predicted by the Rayleigh distribution or because their
measurement is more difficult (subsection 4.1 gives a possible explanation).



ellipticity distribution we only have to substitute

$$\mu_w = Q_{11} + Q_{22}, \quad \mu_z = Q_{11} - Q_{22} \tag{19}$$

and to take variances and correlation from Equation 14.
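A minimal sketch of the ratio density of Equations 15 to 18 is given below, written for the ratio $t = z/w$ with numerator $z$ and denominator $w$. It is an illustration under stated assumptions and not the code released with this work: the function name `marsaglia_pdf`, the argument order and the example parameters are hypothetical. The two exponentials of Equation 17 are combined into a single one, which is algebraically equivalent and avoids numerical overflow because $q^2 \leq a^2 + b^2$.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def marsaglia_pdf(t, mu_w, mu_z, sigma_w, sigma_z, rho):
    """Density of t = z / w for correlated Gaussian z (numerator) and w (denominator)."""
    t = np.asarray(t, dtype=float)
    root = math.sqrt(1.0 - rho**2)
    r = sigma_w / (sigma_z * root)                      # Equation 16
    s = rho * sigma_z / sigma_w
    a = (mu_z / sigma_z - rho * mu_w / sigma_w) / root  # Equation 18
    b = mu_w / sigma_w
    u = r * (t - s)                                     # argument of f in Equation 15
    q = (b + a * u) / np.sqrt(1.0 + u**2)
    norm = np.pi * (1.0 + u**2)
    cauchy_term = np.exp(-0.5 * (a**2 + b**2)) / norm
    # second term of Equation 17 with both exponentials merged into one
    erf_term = (math.sqrt(math.pi / 2.0) * q * _erf(q / math.sqrt(2.0))
                * np.exp(0.5 * (q**2 - a**2 - b**2)) / norm)
    return r * (cauchy_term + erf_term)

# Illustrative parameters only: a denominator measured with signal-to-noise of 10
params = dict(mu_w=10.0, mu_z=6.0, sigma_w=1.0, sigma_z=1.0, rho=0.5)
t = np.linspace(-3.0, 4.0, 7001)
dt = t[1] - t[0]
pdf = marsaglia_pdf(t, **params)
print("integral over [-3, 4]:", np.sum(pdf) * dt)
print("peak position:", t[np.argmax(pdf)])
```

For vanishing means, unit variances and $\rho = 0$ the function reduces to the Cauchy density of Equation 8, which is a convenient consistency check.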

A non-vanishing correlation $\rho > 0$ has important consequences,
foremost $s > 0$ and thus a shift of the peak of the ellipticity
distribution towards higher values of $\chi$. In Figure 2, we
show the distribution for a fixed ellipticity, i.e. fixed $\rho$, as a function
of the image significance $\nu$, varied in steps of one magnitude.
We can see that even for fairly bright galaxies, a substantial shift
occurs, which grows with decreasing $\nu$. Less elliptical galaxies exhibit
a weaker but still noticeable shift of the peak.

There are two other remarkable features of the Marsaglia distribution.
First, it is not only shifted but in general also skewed
(this is true even for $\rho = 0$) with a long tail towards lower values
of $|\chi|$. It is thus not obvious whether the expectation value
$\langle p_M(\chi'|\chi) \rangle$ deviates from $\chi$, constituting a bias according
to Equation 3. The top panel of Figure 3 shows the integral over
the entire distribution minus the true value of $\chi$ as a function of $e$
and different values of $\nu$ (solid lines). As expected, there is almost
no bias for bright objects (black solid line), even up to large ellipticities.
But with increasing noise level, the bias becomes initially
negative (long tail dominates) before it turns positive (shift of the
peak dominates). That means, the simple fact that the ellipticity is
related to the image data in a non-linear way constitutes a bias of
remarkable amplitude and non-trivial behavior.

Similar findings, namely a shift of the peak and a skewed distribution,
are reported by Refregier et al. (2012) and Kacprzak et al.
(2012), but our derivation does not require us to adopt a model-fitting
approach or Gaussian likelihoods. It is thus not restricted to
a particular shape of the galaxy and can also deal with convolutions
with arbitrary PSF shapes as long as the process can be described
in terms of moments (cf. Melchior et al. 2011). It is, however, not
entirely obvious how to quantitatively compare to their distribution
of maximum-likelihood estimators of $e$.

^ For the entire paper, we use the definition of $\nu$ from Erben et al. (2001).



Second, the Marsaglia distribution is not bounded by $|\chi| \leq 1$.
In fact, it inherits the wide Cauchy-type wings and is thus capable
of generating outliers with unbounded errors. Still, most errors are
comparatively small such that outliers most often originate from
galaxies with large initial ellipticities, rendering the outlier fraction
strongly ellipticity dependent (bottom panel of Figure 3). With
increasing noise level, ever smaller ellipticities can become
outliers. The exclusion of outliers from the integral over the distribution
$p_M(\chi)$ leads to much stronger and mostly negative biases
(dashed lines in the top panel) as the positive impact of the shifted
peak becomes limited. How measurement codes and shear statistics
respond to such outliers will be a recurring topic in the remainder
of this work.
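The growth of the outlier fraction with noise can be illustrated by sampling the ratio of correlated Gaussian variates directly. In the sketch below the mean of the denominator is normalised to one, and `snr_w` denotes the signal-to-noise ratio of the denominator $w$, which is not the image significance $\nu$ of Erben et al. (2001). The correlation is set to the value quoted for $e = 0.6$, while the noise levels, the ratio $\sigma_z/\sigma_w$ and the function name are assumptions chosen for illustration.

```python
import numpy as np

def sample_chi(chi_true, snr_w, rho, sigma_ratio=1.0, size=200_000, seed=0):
    """Draw noisy chi' = z / w for correlated Gaussian z and w (mean of w set to 1)."""
    rng = np.random.default_rng(seed)
    sigma_w = 1.0 / snr_w
    sigma_z = sigma_ratio * sigma_w
    cov = [[sigma_w**2, rho * sigma_w * sigma_z],
           [rho * sigma_w * sigma_z, sigma_z**2]]
    w, z = rng.multivariate_normal([1.0, chi_true], cov, size=size).T
    return z / w

chi_true = 0.88                                  # equivalent to e = 0.6
results = {}
for snr_w in (8.0, 4.0, 2.0):
    chi = sample_chi(chi_true, snr_w, rho=0.993, sigma_ratio=0.96)
    inside = np.abs(chi) <= 1.0
    outlier_fraction = 1.0 - inside.mean()
    clipped_bias = chi[inside].mean() - chi_true  # mean restricted to |chi'| <= 1
    results[snr_w] = (outlier_fraction, clipped_bias)
    print(f"snr_w = {snr_w:.0f}: outliers {outlier_fraction:.4f}, "
          f"clipped bias {clipped_bias:+.4f}")
```

The clipped mean corresponds to the restriction of the integral to $|\chi'| \leq 1$; the unclipped sample mean is not printed because the wide wings make it unstable for small `snr_w`.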



3 MEASUREMENT-RELATED ELLIPTICITY BIASES 

As we laid out above, an ellipticity measurement cannot provide an 
unbiased result, simply because the error distribution is shifted and 
skewed for any ellipticity $|e| > 0$. Instead of requiring a shape measurement
method to yield unbiased ellipticity estimates, we should
rather require that it reproduces the theoretically expected noisy 
distribution. In this section, we argue that in general not even that 
is possible because the attempt to measure a shape necessarily con- 
tributes its own uncertainties and, for non-linear shape parameters 
such as the galaxy size, its own biases (Refregier et al. 2012). 

Even though the biases arising from the limited amount of in- 
formation extractable from a noisy image are highly specific to the 
measurement method employed, there are general problems affect- 
ing each method in a similar way. In this section, we highlight the 
most relevant of these problems and seek to describe their level of 
systematic contamination of the ellipticity estimates. The tests are 
all carried out with the moment-based method DEIMOS (Melchior
et al. 2011), but great care has been taken to ensure that our approach
and the conclusions we draw from the tests can be generalized
to other methods.

The test galaxy images follow the Sersic radial profile (Sersic
1968) with intrinsic parameters, including the ellipticity, taken
from model fits to galaxies in the GEMS Hubble Space Telescope
survey (Haussler et al. 2007, see Figure 4). The galaxies are convolved
with circular PSFs of Moffat-type (Moffat 1969). We choose
two different PSF widths to mimic ground-based and space-based
conditions. The pixel noise follows Equation 7 at two levels:
the first one corresponds to optimistic weak-lensing
conditions of $\nu = 35$, defined as in Erben et al. (2001, Equation 16
therein), while the second one is one magnitude fainter ($\nu = 15$). In
both cases, the images exhibit prominent pixel noise, but the galaxies
remain clearly detectable. The ground-based PSF has a Moffat
index of 9 and the galaxy resolution factor is $R_2 \simeq 0.4$ (defined in Hirata
et al. 2004, equation 8). The space-based PSF has a Moffat index
of 3, and the resolution is $R_2 \simeq 0.7$.
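For readers who want to build comparable test images, the two radial profiles named above can be sketched as follows. The Moffat profile is written in terms of its full width at half maximum, and the Sersic profile uses the common approximation $b_n \approx 2n - 1/3$; the function names and all numerical values are assumptions of this example, and the simulation setup of this work is not reproduced here.

```python
import numpy as np

def moffat_profile(r, fwhm, beta):
    """Circular Moffat profile with index beta, normalised to 1 at r = 0."""
    alpha = fwhm / (2.0 * np.sqrt(2.0 ** (1.0 / beta) - 1.0))
    return (1.0 + (r / alpha) ** 2) ** (-beta)

def sersic_profile(r, r_e, n, i_e=1.0):
    """Sersic profile with effective radius r_e, using b_n ~ 2n - 1/3 (approximation)."""
    b_n = 2.0 * n - 1.0 / 3.0
    return i_e * np.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))

r = np.linspace(0.0, 10.0, 101)
psf_ground = moffat_profile(r, fwhm=3.0, beta=9.0)   # widths are illustrative only
psf_space = moffat_profile(r, fwhm=1.5, beta=3.0)
disc = sersic_profile(r, r_e=3.0, n=1.0)             # exponential, disc-like profile
print(psf_ground[:3], psf_space[:3], disc[:3])
```

An elliptical galaxy image follows by evaluating the Sersic profile on an elliptically transformed radius and convolving the result with the PSF.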



3.1 Centroid errors 

Determining the accurate centroid $\bar{\mathbf{x}}$ of an object is crucial for
any further shape analysis. This is obvious for moment-based approaches,
for which the ellipticity definitions given in Equation 5
are sensible only if the centroid is chosen such that the dipole moment

$$D_i = \int \mathrm{d}^2x\; I(\mathbf{x})\,(x_i - \bar{x}_i) = 0 \quad \text{for } i \in \{1, 2\}, \tag{20}$$

which therefore needs to be enforced by such methods.
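One simple way to enforce a vanishing dipole moment is to iterate the centroid until the weighted first moments are zero. The sketch below uses a circular Gaussian weight centered on the current centroid estimate, in the spirit of Equation 6; the weight shape, the convergence threshold and the function name are assumptions of this example and do not describe the implementation of any particular code.

```python
import numpy as np

def weighted_centroid(image, start, sigma_w, max_iter=100, tol=1e-6):
    """Iterate the centroid until the weighted dipole moment vanishes."""
    y, x = np.indices(image.shape, dtype=float)
    xc, yc = map(float, start)
    for _ in range(max_iter):
        weight = np.exp(-0.5 * ((x - xc) ** 2 + (y - yc) ** 2) / sigma_w**2)
        norm = np.sum(weight * image)
        dx = np.sum(weight * image * (x - xc)) / norm   # normalised dipole, direction 1
        dy = np.sum(weight * image * (y - yc)) / norm   # normalised dipole, direction 2
        xc, yc = xc + dx, yc + dy
        if np.hypot(dx, dy) < tol:
            break
    return xc, yc

# Noise-free elliptical Gaussian with a sub-pixel center (illustrative values)
true_center = (20.3, 17.6)
y, x = np.indices((41, 41), dtype=float)
image = np.exp(-0.5 * (((x - true_center[0]) / 3.0) ** 2
                       + ((y - true_center[1]) / 1.5) ** 2))
row, col = np.unravel_index(np.argmax(image), image.shape)
found = weighted_centroid(image, start=(col, row), sigma_w=3.0)
print(found)
```

In a noise-free image the iteration recovers the true center; with pixel noise added, the converged position scatters around it, which is the centroid error discussed in this section.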

Analogously, model-fitting approaches for galaxies or stars
rely on models with peaked light distributions and finite support.
Due to the limited amount of information in the image data, such
models are often derived from a radial profile $p(r)$, whose radial
coordinate undergoes an ellipticity transformation, rendering the
light distribution axisymmetric. The radial coordinate is expressed
relative to the centroid or the peak position of the light distribution,

$$r = \left\| \begin{pmatrix} 1 - e_1 & -e_2 \\ -e_2 & 1 + e_1 \end{pmatrix} (\mathbf{x} - \bar{\mathbf{x}}) \right\|, \tag{21}$$

such that the estimation of the model parameters explicitly depends
on the estimation of $\bar{\mathbf{x}}$.
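The elliptical radius entering such models can be sketched in a few lines, assuming that the radial coordinate is the norm of the transformed offset from the centroid, with the transformation matrix of Equation 21. The function name and the test values are hypothetical.

```python
import numpy as np

def elliptical_radius(x, y, centroid, e1, e2):
    """Norm of the ellipticity-transformed offset from the centroid (Equation 21)."""
    dx, dy = x - centroid[0], y - centroid[1]
    xp = (1.0 - e1) * dx - e2 * dy
    yp = -e2 * dx + (1.0 + e1) * dy
    return np.hypot(xp, yp)

# With e1 = 0.5 and e2 = 0, the isophote r = 1 reaches x = 2 and y = 2/3:
# the semi-major axis lies along the 1-direction with axis ratio (1 - e1)/(1 + e1).
print(elliptical_radius(2.0, 0.0, (0.0, 0.0), 0.5, 0.0))
print(elliptical_radius(0.0, 2.0 / 3.0, (0.0, 0.0), 0.5, 0.0))

# A wrong centroid changes r at every pixel and therefore the model fit
print(elliptical_radius(2.0, 0.0, (0.3, 0.0), 0.5, 0.0))
```

Because every pixel's radius depends on the assumed centroid, a centroid error propagates into all fitted model parameters, including the ellipticity.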

With non-vanishing pixel noise, any measured centroid position
is to some degree inaccurate, giving rise to an error $\Delta\bar{\mathbf{x}}$ in the
measured location. It is useful to limit the discussion to perfectly
elliptical shapes, i.e. galaxies whose isophotes have the same center,
orientation, and ellipticity. By doing so, we can introduce polar
coordinates, $\Delta\bar{\mathbf{x}} \rightarrow (r_c, \phi_c)$, and rotate again into a frame with
only one non-vanishing ellipticity component (cf. Figure 5). Then,
we can identify the centroid error with $\mathcal{P}$ in Equation 2,

$$p_c(e'|e,\theta,\nu) = \int \mathrm{d}r_c \int_{-\pi/2}^{\pi/2} \mathrm{d}\phi_c\; e'(r_c, \phi_c|e,\theta)\, p(r_c, \phi_c|e,\theta,\nu). \tag{22}$$

The top panel of Figure 6 shows the distribution of centroid
errors under noise. In general, centroid errors for an elliptical light
distribution show a similarly elliptical distribution, i.e. errors along
the semi-major axis are larger than along the semi-minor axis where
the light distribution has a larger gradient (e.g. Lewis 2009). In
the polar coordinate frame this translates into

$$p(\phi_c|r_c) \propto r_c \exp\left[ -\frac{r_c^2}{2} \left( \frac{\cos^2\phi_c}{a^2} + \frac{\sin^2\phi_c}{b^2} \right) \right], \tag{23}$$



where we assumed an elliptical Gaussian distribution for $p(\Delta\bar{\mathbf{x}})$
with semi-major axis $a$ and semi-minor axis $b$. Remarkable about
this distribution is the enhanced probability of finding centroid errors
with small $\phi_c$, and that this preferential alignment with the
semi-major axis is most prominent for galaxies with large ellipticities.
For this plot we chose the well-resolved galaxies with the high
noise setting such as to emphasize the alignment of the centroid errors
with the semi-major axis. For a ground-based instrument the
distribution $p(\phi_c)$ would be much less peaked around zero for any
$r_c$, and larger values of $r_c$ would be more likely. At lower noise,
the distribution would be shifted towards smaller values of $r_c$.

Figure 5. Sketch of a perfectly elliptical "galaxy" affected by a positive
noise fluctuation (red area), causing a centroid offset of length $r_c$ at an
angle $\phi_c$ from the semi-major axis.
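The preferential alignment of centroid errors with the semi-major axis follows directly from the elliptical Gaussian assumption and can be sampled in a few lines. The widths `a` and `b`, the sample size and the seed are illustrative choices; the angle is folded into $(-\pi/2, \pi/2)$ because an ellipse is symmetric under rotation by $\pi$.

```python
import numpy as np

rng = np.random.default_rng(3)
a, b = 2.0, 1.0                         # illustrative error widths along major / minor axis
dx = rng.normal(0.0, a, 200_000)        # centroid error along the semi-major axis
dy = rng.normal(0.0, b, 200_000)        # centroid error along the semi-minor axis

r_c = np.hypot(dx, dy)
phi_c = np.arctan(dy / dx)              # folded into (-pi/2, pi/2)

# Fraction of offsets within 45 degrees of the semi-major axis
aligned = np.mean(np.abs(phi_c) < np.pi / 4.0)
expected = 2.0 / np.pi * np.arctan(a / b)
print(f"aligned fraction: sampled {aligned:.3f}, expected {expected:.3f}")
```

For isotropic errors ($a = b$) the aligned fraction would be one half; the excess over one half grows with the axis ratio of the error distribution.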

To assess the impact of miscentering, we artificially shift the
centroid position assumed in our shape measurement code, and
record the resulting deconvolved ellipticity in the absence of any pixel
noise. The middle panel of Figure 6 illustrates the impact of miscentering
on a typical disc-like galaxy as a function of the angle
$\phi_c$. Shown are the ellipticity estimates for four different intrinsic
ellipticities, with either space-based resolution (solid curves)
or ground-based resolution (dashed curves). Several aspects of this
plot are worth mentioning: Centroid shifts with small angles $\phi_c$, i.e.
aligned with the semi-major axis, shift the flux of the peak to non-vanishing
distances without altering the orientation, thus leading to
an increase of the ellipticity. On the opposite end, centroid shifts
with large $\phi_c$ change the perceived orientation away from the actual
one, thus lowering the inferred ellipticity. Furthermore, while
over- and underestimates are fairly balanced for small $e$, galaxies
with large ellipticity suffer much more strongly from miscentering.
Finally, the effects are stronger for the ground-based case since the
PSF deconvolution amplifies any ellipticity signal, both the true and
the spurious one. This is additionally enhanced by the larger offsets
$r_c$ encountered for the ground-based resolution at fixed $\nu$.

The shape of the miscentering curves can be approximately
described by

$$e'(\phi_c) \approx (|e| + e_0 - e_{90}) \cos^{\eta}(\phi_c) + e_{90}, \tag{24}$$

where the parameters $e_0$, $e_{90}$ and $\eta$ depend on $r_c$ (which in turn depends
on $\nu$) and $e$, as well as other source parameters $\theta$, most notably
the width of the PSF, and the method employed.

In the bottom panel of Figure 6 we show the ellipticity er- 
ror induced by miscentering. We can see that the overall effect of 
the miscentering is a small, but consistent underestimation of the 
inferred ellipticity, which scales linearly with e. This is not surpris- 
ing, since large ellipticities suffer more strongly from miscentering. 
This effect is partially compensated by the preferential alignment of 
the centroid with the semi-major axis such that the centroid errors 
are not entirely isotropic and the average error bias is smaller than 
what would naively be expected when only considering the middle 
panel of Figure 6. Qualitatively, this result is in good agreement 
with the derivation by Bernstein & Jarvis (2002, see their section
8.2 and Equation 8.1) in that the bias depends linearly on $e$ and
scales approximately as $\nu^{-2}$. However, we only find a mild dependence
on the image resolution $R_2$.

Figure 6. Top: Distribution of centroid offsets in polar $(r_c, \phi_c)$ coordinates
for well-resolved galaxies with morphologies from the GEMS survey in
images with high levels of pixel noise. Middle: Ellipticity error as function
of the miscentering angle $\phi_c$ for a noise-free galaxy image. The offsets
$r_c$ were set to the median values of the high-noise ($\nu = 15$) simulations.
Bottom: Ellipticity error as function of intrinsic ellipticity for the same ensemble
used in the top panel, seen with two different noise levels (dashed
and solid lines) and with space-based or ground-based resolution (red and
blue lines). Error bars denote 1-$\sigma$ errors of the mean ellipticity in each bin.
The scatter at the high ellipticity end is mainly driven by the lower number
of simulated galaxies (cf. Figure 4).



3.2 Misalignment 

In noisy images, methods that employ either an elliptical model or
- in the case of moment-based approaches - an elliptical weight
function are subject to random errors in the determination of the
orientation. Like before, we choose a coordinate system aligned
with the semi-major axis of the elliptical source and introduce the
misalignment angle $\phi_m$ (cf. Figure 7). Identifying misalignment as
the conceptual problem in Equation 2 leads to

$$p_m(e'|e,\theta,\nu) = \int \mathrm{d}\phi_m\; e'(\phi_m|e,\theta)\, p(\phi_m|e,\theta,\nu), \tag{25}$$

where $e'(\phi_m|e,\theta)$ denotes the ellipticity measurement performed
with a potentially erroneous orientation. As indicated in Figure 7,
we expect the measured shape of the object to be biased towards
smaller sizes and ellipticities. Estimates with smaller sizes were indeed
reported by Kacprzak et al. (2012, Figure 1 therein) for model-fitting
approaches.

Figure 7. Sketch of a perfectly elliptical "galaxy" (solid) affected by a positive
noise fluctuation (red area), which gives rise to a misalignment of an
elliptical model or weight function with respect to the true orientation by an
angle $\phi_m$ (dotted).

In Figure 8 we show - from top to bottom - the measured distribution
of misalignment angles $\phi_m$ from noisy images, the impact
of misalignment on the ellipticity estimates in noise-free images,
and the net effect of misalignment on the inferred ellipticity as a
function of the true ellipticity. For these tests, $\phi_m$ denotes the orientation
of the weight function used for the measurement, but since
the PSF is circular, the orientation is identical to the actual orientation
of the galaxy. That means, $e'(\phi_m)$ is obtained by rotating
the correctly matched elliptical weight function by the angle $\phi_m$.^
Two essential findings can be made with this test. First, pixel noise leads
to an error on the orientation inferred from the image data, which
becomes smaller with increasing ellipticity. This is not surprising
since a circular source has a uniform distribution of orientation angles,
while increasing the ellipticity renders the determination of
the orientation easier, even in noisy images. The second finding is
that for any $|\phi_m| > 0$ an underestimation of $e$ is observed with a
characteristic shape that approximately follows the relation

$$e'(\phi_m) \approx (|e| - e_{90}) \cos^2(\phi_m) + e_{90}, \tag{26}$$



with some $e_{90}$ that depends on the parameters $\theta$ of the apparent
galactic shape. Combining both findings, we find that misalignment
in noisy images biases the inferred ellipticity low. Looking at the bottom
panel of Figure 8, we can see that the effects of misalignment
are more severe for modestly elliptical galaxies, where we observe
an underestimation also in the moderate noise case with $\nu = 35$.
This is an immediate consequence of the poor alignment constraints
for galaxies with small $|e|$, and as misalignment is dependent on the
apparent, i.e. convolved, ellipticity, ground-based imaging is more
prone to this kind of bias.
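The weaker orientation constraints for rounder galaxies can be illustrated with a deliberately simplified toy model. It assumes isotropic Gaussian noise of equal width on both components of $\chi$, which is not the noise distribution derived in section 2 and not the simulation used for Figure 8; the function name, noise level and sample size are likewise assumptions. The orientation is half the phase of the complex ellipticity.

```python
import numpy as np

def orientation_scatter(chi_true, sigma_chi, size=100_000, seed=4):
    """Width of the orientation error for a source aligned with the 1-direction."""
    rng = np.random.default_rng(seed)
    chi1 = chi_true + rng.normal(0.0, sigma_chi, size)   # component along the true axis
    chi2 = rng.normal(0.0, sigma_chi, size)              # cross component, zero on average
    phi_m = 0.5 * np.arctan2(chi2, chi1)                 # orientation is half the phase
    return np.std(phi_m)

widths = [orientation_scatter(chi, sigma_chi=0.1) for chi in (0.1, 0.3, 0.6)]
for chi, width in zip((0.1, 0.3, 0.6), widths):
    print(f"chi = {chi:.1f}: width of phi_m = {np.degrees(width):.1f} deg")
```

In this toy model the scatter of the misalignment angle shrinks as the ellipticity grows, in line with the qualitative trend described above.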

^ In order to single out the effect of misalignment, we do not alter the size
nor the total ellipticity of the weight function, even though these changes
would occur in practice.

Figure 8. Top: Width of the orientation error angle $\phi_m$ for the high noise
level. For better visibility, the space-based data has been shifted horizontally
by 0.01. Error bars denote 1-$\sigma$ intervals. Middle: Error of the ellipticity
when the weight function is rotated by the angle $\phi_m$ for four different
galactic ellipticities. Bottom: Error of the ellipticity in noisy images caused
by misalignment as a function of galactic ellipticity. Error bars denote the
1-$\sigma$ errors of the mean in each ellipticity bin.



3.3 Method-related limitations 

So far, we have dealt with the problem of pixel noise assuming that 
a method for estimating the ellipticity is employed, whose validity 
does not become questionable in presence of noise. Unfortunately, 
we cannot expect shape measurement methods to be entirely indif- 
ferent to increasing amounts of noise. 

Moment-based methods, such as KSB (Kaiser et al. 1995), 
HOLICS (Okura & Futamase 2009) or Fdnt (Bernstein 2010), 
need to apply a weight function to the data to suppress the impact 
of pixel noise at large distance from the source. The application of 
the weight function leaves an imprint on the measured moments, 
which can be corrected, albeit only approximately. With increas- 
ing amplitude of the noise, the only way to limit the variance of 
the ellipticity estimates is to shrink the weight function. Then, the
approximations for the weight function correction become increasingly
inaccurate, leading to errors in the deweighted moments. The
direction and amplitude of the errors are a function of the apparent
galaxy shape, in particular its slope, and properties of the PSF (e.g.
Melchior et al. 2011, Figure 1 therein).

Model-fitting approaches do not need an artificial weight func- 
tion as they make use of the compactness of their galaxy model, 
which is often related to the Sersic radial profile. One problematic 
aspect is the validity of this model - or model family - to faithfully 
describe the morphologies of all galaxies present in the observation. 
If data with a higher significance and possibly also a higher spatial
resolution are available, the model assumptions can be verified, but
this is often unfeasible. Bayesian approaches in model-fitting (e.g.
Lensfit, Miller et al. 2007; Kitching et al. 2008) additionally em- 
ploy priors on some parameters of the model, which are hard to 
estimate from the data alone. These priors are themselves derived 
from data of higher quality, and do not necessarily apply to the 
data at hand. But even if galaxies were purely elliptical - as many 
models assume - unbiased shear estimates require the radial profile 
to be accurately matched to the observed galaxies (Voigt & Bridle 
2010), which cannot be guaranteed with images of severely limited 
significance. 

Model assumptions could be avoided by using a decomposi- 
tion into complete basis function sets, such as shapelets (Refregier 
2003) or Sersiclets (Ngan et al. 2009). However, the pixel noise 
limits the number of modes used in the fit such that the resulting 
model becomes dominated by the shape of the zeroth order. In case 
of the circular shapelet basis function set, this introduces a bias to- 
wards circular objects (Melchior et al. 2010). For Sersiclets, which 
can be considered a generalization of shapelets, using a finite num- 
ber of modes leads to a relation between the slope of the radial 
profile and the spatial scale of higher-order fluctuations, which is 
not necessarily obeyed by observed galaxies (Andrae et al. 2011).

In summary, any method that deals with severely degraded
data invokes additional assumptions about the data, which may turn
out to be wrong once an adequate assessment can be devised, e.g.
with data of higher quality. It needs to be shown that the meth- 
ods employed are able to meet requirements demanded to reach 
the scientific goals of the project at hand. Therefore, simulations 
with simplified galaxy models (e.g. Heymans et al. 2006; Bridle 
et al. 2010; Kitching et al. 2012) provide a clean way of compar- 
ing several methods, but should be complemented by simulations 
with realistic galaxy morphologies (such as Massey et al. 2004; 
Meneghetti et al. 2008; Mandelbaum et al. 2012). 

Additional problems arise from the occurrence of outliers, i.e.
|e'| ≥ 1. From the discussion in section 2 and by looking at Fig-
ure 3, we know that outliers must occur if the combination of appar- 
ent ellipticity and pixel noise exceeds some value. Moment-based 
methods will just let these outliers pass (unless they try to avoid 
them by shrinking the weight function) and expect a subsequent 
lensing analysis to deal with them. We revisit this problem in sub- 
section 4.1. On the other hand, by construction model-fitting meth- 
ods cannot have such outliers, but they will encounter catastrophic 
events, such as convergence failures, and flag these objects, which 
again excludes them from the analysis later on. 

Alternatively, model-fitting approaches might invoke a prior 
on the ellipticity, which could in principle completely prevent the 
occurrence of outliers or modeling failures. However, this comes at 
the price of essentially recovering the prior for data with very little 
constraining power on the ellipticity. As we said above, this prior 
does not necessarily describe the actual data accurately. Moreover, 
even if the prior is an accurate description of the ensemble ellipticity
distribution, the fact that at fixed noise level outliers preferentially
occur for galaxies with large ellipticity leads to a stronger
impact of the prior on those galaxies. Since large ellipticities are
less abundant than those around |e| = 0.3 (cf. Figure 4), we expect
a bias towards the prior which grows with galaxy ellipticity.

For years, attempts have been made to calibrate the bias away 
based on simulated images, e.g. by introducing global "fudge" fac- 
tors that boost the measured ellipticities (e.g. Heymans et al. 2006, 
Table A1 therein). More recently, sophisticated supervised learning
methods have been employed to correct for the bias as a function 
of several input parameters (Gruen et al. 2010; Tewes et al. 2012; 
Kacprzak et al. 2012). Although they acknowledge and account for
the presence of biases in the measurements, irrespective of their
origin, they hinge on the quality, size, and representativeness of the
training set. We thus expect these methods to work well for galaxies
that are abundant in the training data (cf. Kitching et al. 2012) and
whose properties accurately resemble those of actually observed
galaxies.



4 INADEQUATE SHEAR STATISTICS 

We now turn to a different type of systematic problem caused by the 
pixel noise, which does not occur at the level of individual elliptic- 
ity estimates but later on, namely when statistics of the measured 
ellipticity distribution are calculated. Pixel noise can render these 
statistics difficult to interpret or entirely inappropriate by violating 
their fundamental assumptions. One such assumption in virtually 
every lensing analysis is that a proper definition of ellipticity pro- 
vides an unbiased estimator of the (reduced) shear, 



$$\langle e \rangle = \int d^2e \; e \, p(e) = g. \tag{27}$$



For a noise-free measurement of the ellipticity in the form of Equa- 
tion 5b, this is in fact the case (Seitz & Schneider 1997; Bartelmann 
& Schneider 2001). 
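
The unbiasedness of the noise-free average can be illustrated with a short numerical check. The following is a minimal Python sketch under stated assumptions, not the code released with this work: the helper names `shear` and `mean_sheared_ellipticity` are hypothetical, the lensing transformation is taken in the form of Equation B6, and the intrinsic ellipticities are placed on a ring with equally spaced phases so that the isotropic average carries no sampling noise.

```python
import cmath
import math


def shear(e_s, g):
    """Apply a reduced shear g (|g| < 1) to an intrinsic ellipticity e_s."""
    return (e_s + g) / (1 + g.conjugate() * e_s)


def mean_sheared_ellipticity(modulus, g, n_phases=64):
    """Average the sheared ellipticity over equally spaced intrinsic phases."""
    total = 0j
    for k in range(n_phases):
        e_s = modulus * cmath.exp(2j * math.pi * k / n_phases)
        total += shear(e_s, g)
    return total / n_phases


g_true = complex(0.15, 0.05)
g_est = mean_sheared_ellipticity(0.3, g_true)
print(g_est)
```

For this symmetric, noise-free sample the average reproduces the input shear to numerical precision, which is the property that pixel noise subsequently undermines.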

For what follows, we assume the optimistic scenario in which the
measurement methods provide ellipticity estimates that directly
follow the Marsaglia distribution of Equation 15.
That means all complications discussed in section 3 are
eliminated. In practice, this can be realized by using the sampling 
method outlined in Appendix B, which provides a realistic ellip- 
ticity distribution of Gaussian-shaped galaxies under noise but is 
not affected by shape-measurement biases. We recall the two main 
distinctions between the two popular moment-based ellipticity
estimators from Appendix A: e has a problematic distribution under
noise (cf. Figure A1), while χ depends on the shear in a non-linear
way. Both features will prove to be problematic.



4.1 Outliers and their rejection 

As we have stressed several times now, the pixel noise can lead to 
ellipticity outliers with |e'| ≥ 1 (cf. Figure 3) and thus gives rise to
a second population of ellipticity measurements, namely those on 
the unit circle in e-space. If the resulting sample is naively inserted 
in Equation 27, the presence of these outliers, whose position on the 
ring is still loosely correlated with their noise-free location, should 
lead to an overestimation of the inferred shear, simply because they 
all have unit ellipticity and thus large impact on the mean of the 
distribution. Surprisingly, this is not observed in the top-left panel of
Figure 9, where we show the mean ellipticity as a function of shear
and noise level. If the shear only has one non-vanishing component,
Figure 9. Performance of shear statistics under pixel noise. Top left:
Average of the entire e distribution, following Equation 27. Top right: Same as in
the left panel, but after rejection of outliers with |e| > 1. Bottom left:
Non-linear solver for the shear from measurements of χ, based on Equation 29.
Bottom right: Linearized relation between χ and the shear, following
Equation A2. For all panels, shear was only applied in the 1-direction (solid lines;
dashed lines in the top left panel: g_2 = 0.1), and the ellipticity noise was
of Marsaglia-type, simulated with the algorithm described in Appendix B.
Means and errors are taken from 10 independent noise realizations at each
value of the shear; each realization comprised 10,000 samples.



the population on the unit circle almost perfectly compensates the 
underestimation we expect from the mean of the Marsaglia distri- 
bution (cf. Figure 3). This balance is delicate, and it is quite possi- 
ble that it is not maintained for galaxies with a different radial pro- 
file, where the means and errors of the measured moments deviate 
from our Gaussian calculation. But even if the balance persisted, 
statistics other than the mean, e.g. the two-point correlation func- 
tion, will in general pick up the presence of outliers and perform 
erratically. Another concerning aspect of (e) as shear estimator be- 
comes apparent when the other shear component comes into play. 
Because the noise affects the outliers by altering their phase on 
the unit circle, the estimate g_1 becomes dependent on g_2 (dashed
lines). The cross-talk between the two shear components is relevant 
for the faintest two noise settings, where the outliers constitute a 
significant portion of the entire sample.** 

On the other hand, excluding these outliers would be absolutely
justified since any ellipticity definition needs to be bounded
by 1, otherwise the ratio of semi-minor to semi-major axis is
nonsensical. Unfortunately, this commonly adopted approach also
leads to biases because we now sample from

$$p_o(e'|e,\theta,\nu) = \int_{|e+n|<1} d^2n \, (e + n) \, p(n|e,\theta,\nu). \tag{28}$$



° The other estimators shown in Figure 9 do not suffer significantly from 
this sort of cross-talk since they either do not have outliers (top right panel) 
or the outlier population has a well-behaved shape (both estimators based
on χ). Hence, we only show the effect for ⟨e⟩.
Even if p(n) were isotropic and had zero mean, the truncation of
the integration range at the ellipticity unit circle poses a bias for
all galaxies with sufficiently large ellipticity. If p(n) is monotonically
decreasing with increasing |n|, as is common, this bias will
be negative because the outlier-excluded distribution is more compact
than the actual noisy distribution. In other words, this negative
bias increases with ellipticity or, equivalently, with the probability
of obtaining a noise contribution that can push the measured
ellipticity beyond the unit circle. Consequently, sampling from p_o(e'|e)
rather than from p(e) in Equation 27 results in a low bias on g since
any coherent distortion of the ellipticity distribution will increase
the probability of falling outside of the unit circle for all galaxies
whose intrinsic ellipticity is aligned with the additional distortion.
Hence, this distortion becomes suppressed in the outlier-rejected
sample. This is shown in the top-right panel of Figure 9. The bias
becomes more prominent with increasing shear since large ellipticities
become more abundant and the entire distribution thus more
likely to create outliers at any non-vanishing noise level.

It is important to note that even though we have discussed the 
outlier problem solely in terms of moment-based measurements, 
shear estimates from model-fitting methods are susceptible to the 
same bias: Any filtering on catastrophic modeling failures is equiv- 
alent to outlier rejection, and if such failures become more promi- 
nent with increasing source ellipticity, the shape of the bias curves 
will follow the one in the top-right panel of Figure 9. 
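
The truncation bias can be reproduced with a toy experiment. This is a sketch under simplifying assumptions rather than the simulation behind Figure 9: the noise is taken to be additive, isotropic, and Gaussian with a hypothetical dispersion `sigma_n` per component (instead of Marsaglia-type noise), and the function name is illustrative.

```python
import random


def mean_after_rejection(e1, sigma_n, n_samples, rng):
    """Mean of the first ellipticity component after rejecting |e'| >= 1."""
    kept_sum, kept = 0.0, 0
    for _ in range(n_samples):
        # isotropic, zero-mean Gaussian noise added to e = (e1, 0)
        m1 = e1 + rng.gauss(0.0, sigma_n)
        m2 = rng.gauss(0.0, sigma_n)
        if m1 * m1 + m2 * m2 < 1.0:
            kept_sum += m1
            kept += 1
    return kept_sum / kept


rng = random.Random(42)
bias_small = mean_after_rejection(0.2, 0.2, 20000, rng) - 0.2
bias_large = mean_after_rejection(0.8, 0.2, 20000, rng) - 0.8
print(bias_small, bias_large)
```

With these settings the outlier-rejected sample of the more elliptical galaxy is biased low, while the rounder galaxy is essentially unaffected, in line with the ellipticity dependence described above.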

4.2 Non-linear statistics 

Using χ rather than e provides the advantage of a much simpler
noise distribution with a closed description: the Marsaglia distribution
of Equation 15. However, the relation between χ and g is
non-linear. As before, we are thus faced with error propagation in
a non-linear system and have to expect biased shear estimates even
from a perfect measurement.

The theoretically correct way of obtaining shear estimates
from measurements of χ is solving for the shear g that nulls the
mean of the source-plane ellipticity (Schneider & Seitz 1995),

$$\chi^s = \frac{\chi - 2g + g^2 \chi^*}{1 + |g|^2 - 2\Re(g\chi^*)}, \tag{29}$$

as the distribution of unlensed galaxy ellipticities is assumed to be 
isotropic, i.e. with zero mean. To our knowledge, the bottom-left 
panel of Figure 9 shows the first application of this estimator. While 
unbiased for all shears at zero noise, and being still unbiased for
ν = 35 and g_1 < 0.15, the solver exhibits negative bias for the two
faintest noise settings.

In practice, Equation 29 is approximated to first order in the 
shear, which leads to Equation A2 and the usage of the so-called 
responsivity correction. Unsurprisingly, this simplified relation in- 
troduces its own bias that grows with the shear, which is shown in 
the bottom-right panel of Figure 9. Moreover, this simplified esti- 
mator does not perform any better than the fully non-linear solver in 
the left panel. In fact, our novel estimator appears to cope well with 
outliers in the noisy ellipticity distribution and to perform more 
reliably than the other three. It has the highest computational
demands as it involves the minimization of an objective function, the
modulus of χ^s in Equation 29, but can be implemented efficiently.
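
The estimator can be sketched in a few lines. This is an illustrative implementation, not the released one: instead of a general minimizer for the modulus of the mean source-plane ellipticity, it uses a simple fixed-point iteration, which relies on the mean of χ^s changing by roughly -2 δg for a small change δg of the trial shear. The function names and the test sample (intrinsic ellipticities on a ring, sheared with Equation B6 and converted to χ with Equation A1) are assumptions made for the example.

```python
import cmath
import math


def chi_source(chi, g):
    """Map an observed chi to the source plane for a trial shear g (Equation 29)."""
    num = chi - 2 * g + g * g * chi.conjugate()
    den = 1 + abs(g) ** 2 - 2 * (g * chi.conjugate()).real
    return num / den


def solve_shear(chis, tol=1e-12, max_iter=200):
    """Find g such that the mean source-plane chi vanishes."""
    g = 0j
    for _ in range(max_iter):
        residual = sum(chi_source(c, g) for c in chis) / len(chis)
        if abs(residual) < tol:
            break
        g += residual / 2  # to first order, d<chi_s>/dg is close to -2
    return g


def make_sample(modulus, g, n_phases=64):
    """Noise-free chi values of a sheared ring of intrinsic ellipticities."""
    chis = []
    for k in range(n_phases):
        e_s = modulus * cmath.exp(2j * math.pi * k / n_phases)
        e = (e_s + g) / (1 + g.conjugate() * e_s)  # lensing transformation
        chis.append(2 * e / (1 + abs(e) ** 2))     # conversion from e to chi
    return chis


g_true = complex(0.15, 0.05)
g_est = solve_shear(make_sample(0.3, g_true))
print(g_est)
```

On this noise-free sample the solver recovers the input shear; feeding it noisy χ samples is what produces the behavior discussed for the bottom-left panel of Figure 9.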



4.3 Ellipticity weights 

Often, ellipticity estimates do not directly enter Equation 27, but 
get weighted before. Such a weighted average is tempting for two 




Figure 10. Ellipticity weights: Inverse variance denotes the scheme that
attempts to minimize the mean square error, considering the
ellipticity-dependent measurement noise σ_n and the intrinsic scatter σ_e, according to
w ∝ [σ_n² + σ_e²]^(-1) (Hoekstra et al. 2000). S/N denotes a scheme where the
weight is directly proportional to the measurement significance, which
often favors circular objects. Measurement errors and S/N are obtained from
the DEIMOS method, the intrinsic dispersion was assumed to be σ_e = 0.3.



reasons. First, one can reduce the noise-induced variance by pe- 
nalizing faint galaxies. Second, one might even be able to reduce 
systematic biases by down-weighting galaxies from regimes, where 
the employed method yields consistently wrong ellipticity esti- 
mates. For instance, as the outlier problem increases with apparent 
ellipticity (cf. subsection 4.1) while the responsivity to shear be- 
comes weaker, applying larger weights for less elliptical galaxies 
seems logical. 

But a weighting scheme can introduce a bias on its own, even 
for perfectly unbiased, noise-free ellipticity estimates, namely in 
this seemingly beneficial case of ellipticity-dependent weights. If 
we replace the theoretically unbiased average (e) from Equation 27 
by its weighted and properly normalized equivalent 



$$\langle w e \rangle = \frac{\int d^2e \; w \, e \, p(e)}{\int d^2e \; w \, p(e)}, \tag{30}$$



we can immediately see that the result is the same, i.e. unbiased, 
only if w does not depend on e. If the weight decreases with in- 
creasing ellipticity, we are confronted with an altered and more 
compact distribution 



$$\tilde{p}(e') = w(e') \, p(e'), \tag{31}$$



similar to the case of the outlier-rejected distribution. Even if we do
not explicitly want the weighting scheme to penalize large ellipticities,
the measurement method might report statistics of the effective
signal-to-noise level ν that are turned into weights. Because large
ellipticities are more difficult to measure, these statistics tend
to implicitly depend on the source ellipticity (cf. Figure 10 for two
plausible weighting schemes). For the simple case of an approximately
linear dependence of the weight on e, described by the intercept
w_0 and the slope c, we show in Appendix C that the resulting bias
on the shear estimate is to first order purely multiplicative,

$$\langle w e \rangle - \langle e \rangle \approx g \, \sqrt{\frac{2}{\pi}} \, \frac{c \, \sigma_e}{w_0}, \tag{32}$$

where σ_e denotes the dispersion of the ellipticity estimates. With
reasonable values of σ_e = 0.3 and c/w_0 = -0.1, we obtain a bias of -2.5%.
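
The size of this effect can be checked numerically. The sketch below is an illustration under stated assumptions: it treats one ellipticity component as Gaussian-distributed around the shear g with dispersion σ_e (ignoring the bound |e| < 1), uses the linear weight w = w_0 + c|e|, and compares the numerically integrated weighted mean with the first-order expression g √(2/π) σ_e c/w_0. The function name and the integration settings are hypothetical.

```python
import math


def weighted_mean_bias(g, sigma_e, w0, c, n_steps=20000):
    """Relative bias of the weighted mean of e ~ N(g, sigma_e) for w = w0 + c|e|."""
    lo, hi = g - 8 * sigma_e, g + 8 * sigma_e
    h = (hi - lo) / n_steps
    num = den = 0.0
    for k in range(n_steps):
        e = lo + (k + 0.5) * h  # midpoint rule
        p = math.exp(-0.5 * ((e - g) / sigma_e) ** 2)  # unnormalized Gaussian
        w = w0 + c * abs(e)
        num += w * e * p
        den += w * p
    return num / den / g - 1.0


# first-order multiplicative bias, i.e. the right-hand side divided by g
first_order = math.sqrt(2 / math.pi) * 0.3 * (-0.1)
numerical = weighted_mean_bias(g=0.01, sigma_e=0.3, w0=1.0, c=-0.1)
print(first_order, numerical)
```

For σ_e = 0.3 and c/w_0 = -0.1 the first-order expression gives a relative bias of about -2.4%, and the numerical integration about -2.5%; the small difference is of second order in c/w_0.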



Also the two-point correlation function of (post-lensing) ellip- 
ticities 



(33) 
Table 1. Overview of the ellipticity and shear estimation biases described in this work.

| Source of bias | Section | Effect | Reference |
|---|---|---|---|
| Error propagation | section 2 | noise distribution shifted and skewed, average biased low, strongly ellipticity-dependent | this work |
| Centroiding error | subsection 3.1 | bias ∝ e/ν² | Bernstein & Jarvis (2002) |
| Misalignment | subsection 3.2 | bias most problematic for intermediate ellipticities | this work |
| Various shape estimation issues | subsection 3.3 | deviation of measured ellipticity distribution from its expectation | see subsection 3.3 |
| Ellipticity outliers | subsection 4.1 | cross-talk between shear components (with outliers) or multiplicative shear underestimation (after outlier rejection) | this work |
| Non-linear shear statistics | subsection 4.2 | multiplicative shear underestimation | this work |
| Ellipticity-dependent weighting | subsection 4.3 | multiplicative shear underestimation, negligible in correlation function | this work |


which averages over pairs of galaxies with separation θ = |x_i − x_j|,
is expected to be sensitive to an ellipticity-dependent weighting
scheme. Relevant for the estimation of cosmological parameters are
the amplitude and shape of the shear correlation function ξ_g(θ). A
straightforward calculation (outlined in subsection C1) shows that
the normalized correlation function is given by



{wieiiUjEj) ^ 
{wiWj) w'l 



(34) 



where σ_g denotes the variance of the shear field. Since the constant
terms in the equation above can be determined by looking at
separations θ where we expect the cosmological lensing signal to
vanish, the correlation function is, surprisingly, largely unaffected 
by the weighting scheme. Only at very small scales, it is steepened 
due to the last term in Equation 34, but we expect this effect to be 
clearly subdominant compared to the influence of baryonic physics 
at these small scales. 

In contrast to the outlier rejection bias - which corresponds to
a binary weight: either 1 or 0 - this bias is less severe for large
ellipticities, but it generally affects all galaxies, including those 
with smaller ellipticities, for which the creation of outliers is not a 
significant issue. Consequently, whenever weighted ellipticities are 
inserted into statistics that have been derived for unweighted ellip- 
ticities, correction terms (such as Equation 32) need to be applied, 
which requires accurate knowledge of the ellipticity-dependence of 
the weighting scheme. 



5 SUMMARY, CONCLUSIONS, AND OUTLOOK 

We compiled theoretical and practical evidence that pixel noise 
biases shear estimates at three subsequent levels: 1) the propa- 
gation of the pixel noise into the non-linear quantity ellipticity; 
2) additional uncertainties and biases introduced by ellipticity- 
measurement methods; 3) the application of statistics to infer the 
gravitational shear from a sample of ellipticity measurements, 
which are either unaware of the presence of pixel noise or them- 
selves non-linear and thus biased. 

We summarize our findings in Table 1. In practice, lensing 
analyses are affected by a combination of the aforementioned bi- 
ases, most of which are negative. Considering these findings, it 
is not surprising that every investigation of any shear estimation 
methodology the authors are aware of shows a weakening response 
to shear with increasing noise level. It is inevitable. Even though 
the details differ, the biases we revealed or reinvestigated generally 
not only depend on the object significance but also on its ellipticity. 
We therefore recommend extending shear accuracy test programs to
also inspect trends with ellipticity.



That biases occur at several stages of the analysis pipeline
leads to an unfortunate interdependence of the idiosyncrasies of the
image data (foremost the galaxy ellipticity and signal-to-noise dis- 
tributions), shape-measurement method, and shear statistic. This 
means that a setup which has been found to work well in one sit- 
uation does not necessarily perform so well in others. Given the 
scope and fundamental nature of the majority of these biases, we 
do not believe that the needs of upcoming lensing surveys in terms 
of accuracy and reliability can be met without a substantial effort
in correcting for or avoiding biased ellipticity and shear estimates.

Method-dependent biases can be studied with simulated im- 
ages. Special attention should be paid to cases where the measured 
ellipticity distribution deviates from the expected Marsaglia distri- 
bution. For instance, a pile-up of samples shortly before the unit 
circle is a clear indication of a bias introduced by the shape mea- 
surement code in order to prevent unphysical outliers. 

To correct biases in an actual measurement, one needs to be 
able to identify those parameters that determine the bias. As we 
argued, this is mainly the significance and the ellipticity, but other 
factors, such as the radial profile or changes of the ellipticity with 
radius (Bernstein 2010), will also play a role. Therefore, correction 
schemes need to fully consider the performance of the shape mea- 
surement method as a function of all relevant parameters as well as 
the fact that these parameters themselves can only be inferred with 
a certain precision from the image data (Kacprzak et al. 2012). This 
is computationally and practically challenging. 

Instead of correcting ellipticity estimates, which also does not 
completely eliminate the problem of noise for the shear statistic, 
we should seek solutions that avoid the biases, for instance by stay- 
ing linear in the data for as long as possible (exemplified by stack- 
ing methods in Bridle et al. 2010). We conclude by sketching out 
an alternative idea. Since we worked out the theoretical form of 
the noisy ellipticity distribution, we are able to predict the (biased) 
outcome of a measurement given an assumed galaxy ellipticity and 
noise level. As we showed in section 3, we can also model the errors
that shape measurement methods exhibit once we assume the underlying
galaxy parameters to be known. We can thus ask the question:
How likely is a certain ellipticity given a measured one and its
measurement errors? The result will be an ellipticity likelihood for
each galaxy that incorporates the non-linearities of the measure- 
ment process. Combining full likelihoods should finally lead to un- 
biased shear estimates. Whether this idea works in practice remains 
to be seen. 

To facilitate the review of our findings and to support forth- 
coming development, we make the code used in this work public. 
The PYTHON implementation comprises the computation of the 
Marsaglia distribution with efficient sample generation as well as 
all shear estimators of section 4 and is available at this URL. 
Figure A1. Ellipticity distributions of measured χ' (top) and measured e'
(bottom) from a synthetic catalog of 5,000 objects with intrinsic
dispersion σ_e = 0.3, constant shear g = (0.15, 0.05), indicated by red cross
markers, and pixel noise equivalent to ν = 15. The circle in the top plot
corresponds to the limit of intrinsic |e| < 1, the red points in the bottom
plot indicate galaxies with |e'| > 1. The distributions are created with the
algorithm outlined in Appendix B.

APPENDIX A: MOMENT-BASED ELLIPTICITY 
DEFINITIONS 

In Equation 5 we have introduced two particular forms of the
complex ellipticity, which are related according to (Bartelmann &
Schneider 2001)

$$\chi = \frac{2e}{1 + |e|^2} \quad \text{or} \quad e = \frac{\chi}{1 + \sqrt{1 - |\chi|^2}}. \tag{A1}$$

Since the only significant difference between the two definitions 
occurs in the denominator, they share the same complex phase, 
but have different amplitude. There are two important distinctions 
between these definitions. First, the relation between gravitational 
shear g and mean ellipticity is

$$g = \langle e \rangle \quad \text{or} \quad g \approx \frac{\langle \chi \rangle}{2 (1 - \sigma_\chi^2)}, \tag{A2}$$

where the averages are taken over galaxies affected by a constant
shear g. That means, e is an unbiased estimator of the shear (Seitz
& Schneider 1997), while χ needs a so-called responsivity correction
(the denominator in the equation above), and even then it remains
an approximate estimator of g (see Viola et al. (2011) for a
discussion of higher-order corrections to this relation).
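
The two relations can be put to work as follows. This is a minimal sketch with hypothetical function names; it assumes the conversions between e and χ in the form χ = 2e/(1 + |e|²) and its inverse, and a first-order estimator g ≈ ⟨χ⟩/(2(1 − σ_χ²)) in which σ_χ² is taken to be the per-component variance of χ, estimated from the sample itself. Both are assumptions about the intended conventions.

```python
import cmath
import math


def e_to_chi(e):
    """Convert the ellipticity e to chi."""
    return 2 * e / (1 + abs(e) ** 2)


def chi_to_e(chi):
    """Convert chi to e; only valid for |chi| <= 1."""
    return chi / (1 + math.sqrt(1 - abs(chi) ** 2))


def linear_shear_from_chi(chis):
    """First-order shear estimate from chi with a responsivity correction."""
    mean_chi = sum(chis) / len(chis)
    # per-component variance, assumed to be <|chi|^2> / 2
    sigma2 = sum(abs(c) ** 2 for c in chis) / (2 * len(chis))
    return mean_chi / (2 * (1 - sigma2))


# noise-free test sample: a ring of intrinsic ellipticities under a small shear
g_true = complex(0.02, 0.01)
chis = []
for k in range(64):
    e_s = 0.3 * cmath.exp(2j * math.pi * k / 64)
    e = (e_s + g_true) / (1 + g_true.conjugate() * e_s)
    chis.append(e_to_chi(e))
g_lin = linear_shear_from_chi(chis)
print(g_lin)
```

For small shears the linearized estimate stays close to the input, and the agreement degrades as the shear grows, which is the trend visible in the bottom-right panel of Figure 9.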

The reason for the wide-spread use of χ as shear estimator
rather than e stems from their second distinction, the distribution
under pixel noise. If we assume the pixel noise to be uncorrelated
and Gaussian, the second-order moments share this property due
to their linearity in the data. However, both e and χ are ratios
of combinations of second-order moments. In the case of χ,
numerator and denominator are linear combinations of moments and
thus still follow Gaussian distributions such that their ratio
follows the Marsaglia distribution of Equation 15, a generalization
of the Cauchy distribution (Equation 8). The Cauchy distribution
is known for its diverging variance, which allows arbitrarily large
errors with finite probability, namely when Q_11 + Q_22 → 0. In
contrast, due to the non-linear term √(Q_11 Q_22 − Q_12²) in the
denominator of Equation 5b, e has a much more complicated
distribution. Instead of allowing infinite errors, the additional term in the
denominator can lead to a complex phase, which alters the
orientation of e in the cases where χ would come to lie outside of the
unit circle of viable ellipticities. This effectively couples the two
components of e and leads to two separate distributions, the
ordinary one within the unit circle and the one on the unit circle, which
renders a theoretical description much more difficult.

We show the different distributions of e and χ under noise
in Figure A1. While χ' shows a continuous distribution across the
edge of the unit circle, outlier ellipticities (red points) are located 
right on the unit circle in e'-space. Performing a projection onto 
one component or computing the absolute value of e' would thus 
lead to an enhanced probability of measurements close to the unit 
circle. Moreover, global statistics such as the mean or the variance 
of the e' distribution are significantly altered by the presence of the 
ring at |e'| = 1. 

In summary, while e is theoretically an unbiased estimator of 
the shear, its distribution under noise effectively undermines this 
property. On the other hand, the distribution of χ is much easier
to describe and statistics thereof are fairly robust against noise, but 
the interpretation of these statistics is hampered by the non-linear 
relation to shear. One therefore needs to choose the ellipticity esti- 
mator based on the application at hand, depending on what kind of 
drawback can most effectively be dealt with. 



APPENDIX B: SAMPLING THE NOISY ELLIPTICITY 
DISTRIBUTION 

We present the recipe to simulate a realistic ellipticity distribution, 
considering the effects of non-linear error propagation discussed 
in section 2. Instead of sampling from the complicated Marsaglia 
distribution of Equation 15, we follow the path of the pixel noise, 
i.e. we compute the means and errors for the moment combinations 
w = Q_11 + Q_22 and z = Q_11 − Q_22 and then their ratio χ = z/w.
For a fast and analytic moment calculation, we assume an
elliptical Gaussian shape for both galaxy and/or weight function and
rotate into a frame, such that its semi-major axis is aligned with the
1-direction,

$$W(x) = \exp\left( -\frac{(1-e)^2 x_1^2 + (1+e)^2 x_2^2}{2 s^2} \right). \tag{B1}$$



The size of the galaxy is then defined by s and its flux F =
∫ d²x W(x). Errors of the moments, measured with weight function
W, are given by (Melchior et al. 2011)

$$\sigma_{i,j}^2 = \sigma_n^2 \int d^2x \; W^2(x) \, x_1^{2i} x_2^{2j}, \tag{B2}$$

where σ_n² denotes the pixel noise variance and i, j describe a
moment {W}_{i,j} = ∫ d²x W(x) x_1^i x_2^j. Note that this notation differs
from the notation used throughout the rest of this work. In the case
of a Gaussian-shaped W, these errors can be analytically evaluated. 
The algorithm can trivially be extended to work on moments from 
arbitrarily shaped galaxies, measured from noise-free images. With 
the convolution relation in moment-space (Melchior et al. 2011, 
Equation 9 therein), also the effect of the PSF convolution can be 
taken into account. 

1) For a source with ellipticity e, we compute the modulus |e|,
its equivalent χ from Equation A1, and the phase φ = arg(e).

2) For a Gaussian shape, the scale s defines the sum of second
moments, w = F s². Then, z = χ w.

3) We adopt the definition of significance from Erben et al.
(2001, Equation 16 therein) and insert the error of the flux
F (the {W}_{0,0} moment in the notation used above):
$\nu = \frac{F}{\sqrt{\pi} \, \sigma_n s} \sqrt{(1+e)(1-e)}$. Thus, if we specify ν we get the pixel
noise level σ_n.

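4) The Gaussian errors σ_ij for the second moments Q_ij are then
given by

$$\begin{aligned}
\sigma_{11}^2 &= \sigma_{2,0}^2 = \sigma_n^2 \, \frac{3\pi}{4} \, \frac{s^6}{(1-e)^5 (1+e)}, \\
\sigma_{12}^2 &= \sigma_{1,1}^2 = \sigma_n^2 \, \frac{\pi}{4} \, \frac{s^6}{(1-e)^3 (1+e)^3}, \\
\sigma_{22}^2 &= \sigma_{0,2}^2 = \sigma_n^2 \, \frac{3\pi}{4} \, \frac{s^6}{(1-e) (1+e)^5}.
\end{aligned} \tag{B3}$$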

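5) For the errors ΔQ_11 and ΔQ_22, we sample from a correlated
bi-variate Gaussian, whose covariance matrix is given by

$$\Sigma_{11,22} = \begin{pmatrix} \sigma_{11}^2 & \rho_n \sigma_{11} \sigma_{22} \\ \rho_n \sigma_{11} \sigma_{22} & \sigma_{22}^2 \end{pmatrix} \tag{B4}$$

(cf. discussion that led to Equation 14). This can be realized by
applying the Cholesky decomposition

$$\Sigma_{11,22} = A A^T, \quad A = \begin{pmatrix} \sigma_{11} & 0 \\ \rho_n \sigma_{22} & \sqrt{1 - \rho_n^2} \, \sigma_{22} \end{pmatrix} \tag{B5}$$

to a vector of two N(0, 1) variates.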

6) Then, noisy samples w' = w + ΔQ_11 + ΔQ_22 and z' =
z + ΔQ_11 − ΔQ_22 exhibit the correct variances and correlation
given by Equation 14, and their ratio χ_1' = z'/w' is distributed
according to the Marsaglia distribution of Equation 15.

7) The second component of χ' has an independent numerator
ΔQ_12 ~ N(0, σ_12), but the same denominator: χ_2' = 2ΔQ_12/w'.

8) The original orientation is recovered by χ' → χ' e^(iφ). Then, e'
can be obtained from Equation A1.

Since the only terms entering the Marsaglia distribution are ratios
of moments and errors, and both scale as s², we can set s = 1.
Similarly, the significance ν depends on the ratio of flux F and noise
dispersion σ_n, so that we can also set F = 1 without changing the
results. Effectively, one only has to specify the ellipticity e and the
significance to uniquely describe the effects of noise on the
ellipticity. All information about the galactic shape, which determines
the moments of the brightness distribution and their errors, is then
internally computed (assuming a Gaussian radial profile).
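
Steps 1 to 8 can be condensed into a short routine. The following is a sketch under stated assumptions, not the released implementation: the function name is hypothetical, the moments and their errors are evaluated from the Gaussian integrals for an elliptical Gaussian with semi-axes s/(1 − |e|) and s/(1 + |e|) and s = 1 (rather than from the shorthand of step 2), the significance is taken as ν = F/σ_{0,0}, and the correlation ρ_n between the errors of Q_11 and Q_22 is computed as σ_12²/(σ_11 σ_22), which evaluates to 1/3 for this profile.

```python
import cmath
import math
import random


def sample_noisy_chi(e, nu, rng):
    """Draw one noisy chi' for a Gaussian galaxy with complex ellipticity e
    (|e| < 1) and significance nu, with the scale s set to 1."""
    e_abs, phi = abs(e), cmath.phase(e)
    a = 1.0 / (1.0 - e_abs)               # semi-major axis, along the 1-direction
    b = 1.0 / (1.0 + e_abs)               # semi-minor axis
    flux = 2.0 * math.pi * a * b          # integral of W over the plane
    q11, q22 = flux * a * a, flux * b * b  # noise-free moments, Q12 = 0
    w, z = q11 + q22, q11 - q22
    sigma_n = flux / (nu * math.sqrt(math.pi * a * b))  # from nu = F / sigma_00
    s11 = sigma_n * math.sqrt(0.75 * math.pi * a ** 5 * b)
    s22 = sigma_n * math.sqrt(0.75 * math.pi * a * b ** 5)
    s12 = sigma_n * math.sqrt(0.25 * math.pi * a ** 3 * b ** 3)
    rho = s12 * s12 / (s11 * s22)         # assumed correlation, 1/3 here
    u1, u2, u3 = (rng.gauss(0.0, 1.0) for _ in range(3))
    dq11 = s11 * u1                       # Cholesky factor applied to (u1, u2)
    dq22 = s22 * (rho * u1 + math.sqrt(1.0 - rho * rho) * u2)
    dq12 = s12 * u3                       # independent error of Q12
    w_noisy = w + dq11 + dq22
    z_noisy = z + dq11 - dq22
    chi = complex(z_noisy / w_noisy, 2.0 * dq12 / w_noisy)
    return chi * cmath.exp(1j * phi)      # rotate back to the original orientation


rng = random.Random(0)
samples = [sample_noisy_chi(complex(0.3, 0.1), 15.0, rng) for _ in range(1000)]
print(sum(samples) / len(samples))
```

In the limit of very large ν the routine returns the noise-free χ that corresponds to the input ellipticity, which is a convenient sanity check before lowering the significance.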

Realistic distributions, such as the ones shown in Figure A1,
additionally need to start from a reasonable intrinsic distribution of
ellipticities $\epsilon^s$. With the common assumption of each ellipticity
component being drawn from an independent Gaussian distribution
of some dispersion $\sigma_\epsilon \approx 0.3$, their modulus is drawn from
the Rayleigh distribution
$f(|\epsilon^s|) = \frac{|\epsilon^s|}{\sigma_\epsilon^2} \exp\left(-\frac{|\epsilon^s|^2}{2\sigma_\epsilon^2}\right)$
(cf. Figure 4 for an observed ellipticity distribution), and the
orientation is uniform, $U(0, \pi)$. Finally, a sheared ellipticity
distribution can be obtained from (Seitz & Schneider 1997)



$$\epsilon = \frac{\epsilon^s + g}{1 + \epsilon^s g^*}. \qquad \text{(B6)}$$
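
A minimal sketch of this last stage is given below, assuming NumPy; it is not the released epsnoise implementation. The function names are hypothetical, and the redrawing of moduli with $|\epsilon^s| \geq 1$ is an added assumption that keeps the Rayleigh tail inside the unit disk. The uniform orientation $\varphi \in [0, \pi)$ enters through the spin-2 phase $e^{2i\varphi}$.

```python
import numpy as np

def sample_intrinsic(sigma_e, size, rng):
    """Intrinsic ellipticities: Rayleigh modulus, uniform orientation in [0, pi)."""
    mod = rng.rayleigh(sigma_e, size)
    bad = mod >= 1
    while bad.any():                        # assumption: redraw unphysical moduli
        mod[bad] = rng.rayleigh(sigma_e, bad.sum())
        bad = mod >= 1
    phi = rng.uniform(0.0, np.pi, size)
    return mod * np.exp(2j * phi)

def shear(eps_s, g):
    """Equation B6 for complex ellipticities eps_s and a shear g with |g| < 1."""
    return (eps_s + g) / (1 + eps_s * np.conj(g))

rng = np.random.default_rng(42)
g = 0.05 + 0.02j                            # illustrative shear value
eps_s = sample_intrinsic(0.3, 200_000, rng)
eps = shear(eps_s, g)
print(eps.mean())                           # should scatter around g
```

For an isotropic intrinsic distribution the mean of the sheared ellipticities is expected to recover $g$, which makes the printed average a convenient sanity check before any pixel noise is added.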



We implemented the entire sampling procedure (as well as the 
theoretical form of the Marsaglia distribution and the shear estima- 
tors described in section 4) in PYTHON. The code is open-source 
and available at https://github.com/pmelchior/epsnoise. 



APPENDIX C: WEIGHTING-SCHEME INDUCED BIAS 

We seek an approximate analytical description of the bias that an
ellipticity-dependent weighting scheme introduces on globally acting
statistics such as the average and the two-point correlation function.
To this end, we choose a coordinate frame that is aligned
with the semi-major axis of each galaxy, such that we only have to
consider one component of $\epsilon$. We start by Taylor-expanding the
ellipticity dependence of the weighting scheme to first order:



$$w(\epsilon') = w_0 + \frac{dw(\epsilon)}{d\epsilon}\,\epsilon' + \mathcal{O}(\epsilon'^2). \qquad \text{(C1)}$$



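Assuming the simplest form of a positive offset $w_0 > 0$ and a
small, constant slope $dw(|\epsilon|)/d|\epsilon| = c$, we can simplify the
previous equation to

$$w(\epsilon) = \begin{cases} w_0 + c\,\epsilon & \text{if } \epsilon > 0 \\ w_0 - c\,\epsilon & \text{if } \epsilon < 0. \end{cases} \qquad \text{(C2)}$$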
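
The size of the effect can be gauged numerically before any approximation is made. The sketch below is an illustration under stated assumptions: a Gaussian ellipticity distribution of dispersion `sigma_e` that is only shifted by the shear `g`, weights of the form of Equation C2, and a weighted mean given by the ratio of $\int d\epsilon\, w\,\epsilon\,p$ to $\int d\epsilon\, w\,p$. Function names and parameter values are hypothetical. The comparison value $\sqrt{2/\pi}\,c\,\sigma_\epsilon/w_0$ is the first-order relative bias one obtains from Gaussian half-range integrals under these assumptions.

```python
import numpy as np

def weighted_mean(g, sigma_e, w0, c, n_grid=200_001):
    """Weighted mean of a shifted Gaussian, evaluated on a regular grid."""
    eps = np.linspace(g - 8 * sigma_e, g + 8 * sigma_e, n_grid)
    p = np.exp(-0.5 * ((eps - g) / sigma_e) ** 2)   # normalization cancels in the ratio
    w = w0 + c * np.abs(eps)                        # weighting scheme of Equation C2
    return np.sum(w * eps * p) / np.sum(w * p)

def first_order_relative_bias(sigma_e, w0, c):
    """First-order expectation, assuming a Gaussian that is only shifted by g."""
    return np.sqrt(2 / np.pi) * c * sigma_e / w0

g, sigma_e, w0, c = 0.02, 0.3, 1.0, -0.1            # illustrative values only
exact = weighted_mean(g, sigma_e, w0, c) / g - 1
approx = first_order_relative_bias(sigma_e, w0, c)
print(exact, approx)
```

With `c = 0` the weighted mean reduces to `g`, which is a quick check of the grid; for a negative slope both numbers come out negative, i.e. down-weighting elliptical galaxies lowers the inferred shear.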



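Now we split the integrals of Equation 30 into the lower and
upper half (which allows us to apply the reduced weights also to
negative $\epsilon$):

$$\int_{-\infty}^{\infty} d\epsilon\, w(\epsilon)\,\epsilon\,p(\epsilon) = w_0 \int_{-\infty}^{0} d\epsilon\,\epsilon\,p(\epsilon) - c \int_{-\infty}^{0} d\epsilon\,\epsilon^2 p(\epsilon) + w_0 \int_{0}^{\infty} d\epsilon\,\epsilon\,p(\epsilon) + c \int_{0}^{\infty} d\epsilon\,\epsilon^2 p(\epsilon) \qquad \text{(C3)}$$

and likewise for the denominator. Since there is no sign-flip in the
terms with $w_0$, we can combine these integrals again and exploit
$\int d\epsilon\, p(\epsilon) = 1$ and $\int d\epsilon\,\epsilon\,p(\epsilon) = g$. For the $c$-terms, we need to
carry out the integration over the ellipticity distribution explicitly,
simply because the presence of the shear shifts and skews the
distribution such that the $c$-terms in the equation above do not
exactly cancel. We therefore assume the pre-lensing distribution to be
Gaussian with dispersion $\sigma_\epsilon$ and the shear to only shift the entire
distribution without changing its shape: $p(\epsilon) \to \mathcal{N}(\epsilon - g, \sigma_\epsilon)$. This
assumption will restrict our derivation to small shears. We thus
linearize the resulting integrals to first order in the shear and obtain

$$\int_{0}^{\infty} d\epsilon\,\epsilon\,\mathcal{N}(\epsilon - g, \sigma_\epsilon) - \int_{-\infty}^{0} d\epsilon\,\epsilon\,\mathcal{N}(\epsilon - g, \sigma_\epsilon) \approx \sqrt{\frac{2}{\pi}}\,\sigma_\epsilon,$$

$$\int_{0}^{\infty} d\epsilon\,\epsilon^2 \mathcal{N}(\epsilon - g, \sigma_\epsilon) - \int_{-\infty}^{0} d\epsilon\,\epsilon^2 \mathcal{N}(\epsilon - g, \sigma_\epsilon) \approx \sqrt{\frac{8}{\pi}}\,\sigma_\epsilon\, g. \qquad \text{(C4)}$$

Inserting all terms into Equation 30 yields Equation 32. The result
is accurate to first order in $w(\epsilon)$, $g$, and $c/w_0$.

C1 Shear correlation function

In the limit of small shear, we can simplify Equation B6 between
observed and pre-lensing ellipticities, $\epsilon \approx \epsilon^s + g$, which
corresponds to the case above: the shear only shifts the ellipticity
distribution, but does not skew it. When considering the weighting
scheme from Equation C2, we get the correlation functions of the
weights

$$\langle w_i w_j \rangle(\theta) = \langle (w_0 + c\,\epsilon_i)(w_0 + c\,\epsilon_j) \rangle = w_0^2 + c^2 \langle \epsilon_i \epsilon_j \rangle(\theta) \qquad \text{(C5)}$$

and of the weighted ellipticities

$$\langle w_i \epsilon_i\, w_j \epsilon_j \rangle(\theta) = w_0^2 \langle \epsilon_i \epsilon_j \rangle(\theta) + c^2 \langle \epsilon_i^2 \epsilon_j^2 \rangle(\theta), \qquad \text{(C6)}$$

where $\langle \epsilon_i \epsilon_j \rangle$ is given by Equation 33 and

$$\langle \epsilon_i^2 \epsilon_j^2 \rangle(\theta) = [\ldots] \qquad \text{(C7)}$$

with $\sigma_g^2 = \langle g^2 \rangle$ being the variance of the shear field. In this
derivation we assumed the intrinsic ellipticity field to be uncorrelated
for separation $\theta > 0$ and the shear field to be uncorrelated with the
intrinsic ellipticity field, such that terms like $\langle \epsilon^s_i g_i \rangle$ and
$\langle \epsilon^s_i g_j \rangle$ vanish. Expanding the ratio of Equation C5 and
Equation C6 to first order in $c^2/w_0^2$ yields Equation 34.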